The effectiveness of position- and composition-specific gap costs for protein similarity searches

Authors

  • Aleksandar Stojmirovic
  • E. Michael Gertz
  • Stephen F. Altschul
  • Yi-Kuo Yu

Abstract

MOTIVATION: The flexibility in gap cost enjoyed by hidden Markov models (HMMs) is expected to afford them better retrieval accuracy than position-specific scoring matrices (PSSMs). We attempt to quantify the effect of more general gap parameters by separately examining the influence of position- and composition-specific gap scores, as well as by comparing the retrieval accuracy of the PSSMs constructed using an iterative procedure to that of the HMMs provided by Pfam and SUPERFAMILY, curated ensembles of multiple alignments.

RESULTS: We found that position-specific gap penalties have an advantage over uniform gap costs. We did not explore optimizing distinct uniform gap costs for each query. For Pfam, PSSMs iteratively constructed from seeds based on HMM consensus sequences perform equivalently to HMMs that were adjusted to have constant gap transition probabilities, albeit with much greater variance. We observed no effect of composition-specific gap costs on retrieval performance. These results suggest possible improvements to the PSI-BLAST protein database search program.

AVAILABILITY: The scripts for performing evaluations are available upon request from the authors.
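
The distinction between uniform and position-specific gap costs drawn in the abstract can be illustrated with a toy sketch. This is not the authors' evaluation code, nor the scoring scheme of PSI-BLAST or of any HMM package. The function name `global_score`, the match/mismatch values, the DNA-like toy sequences, and the linear (non-affine) gap model are all assumptions chosen for brevity.

```python
def global_score(query, subject, gap_costs, match=2, mismatch=-1):
    """Global alignment score (Needleman-Wunsch) with one linear gap cost
    per query position.

    gap_costs[i] is the (hypothetical) cost charged for a gap at query
    position i. A uniform scheme is the special case where every entry
    of gap_costs is the same.
    """
    n, m = len(query), len(subject)
    if n == 0 or len(gap_costs) != n:
        raise ValueError("gap_costs needs one entry per query residue")

    # prev[j] = best score for aligning query[:i] with subject[:j]
    prev = [0] * (m + 1)
    for j in range(1, m + 1):
        # Insertions before the first query residue use its gap cost.
        prev[j] = prev[j - 1] - gap_costs[0]

    for i in range(1, n + 1):
        g = gap_costs[i - 1]  # gap cost attached to query[i - 1]
        cur = [prev[0] - g] + [0] * m
        for j in range(1, m + 1):
            s = match if query[i - 1] == subject[j - 1] else mismatch
            cur[j] = max(
                prev[j - 1] + s,  # align query[i-1] with subject[j-1]
                prev[j] - g,      # query[i-1] aligned to a gap
                cur[j - 1] - g,   # subject[j-1] inserted at this position
            )
        prev = cur
    return prev[m]


# Uniform cost: every position charges 3 for a gap.
uniform = global_score("ACGT", "ACT", [3, 3, 3, 3])

# Position-specific cost: a gap at the third query position is cheap.
specific = global_score("ACGT", "ACT", [3, 3, 1, 3])
```

With these toy inputs, `uniform` is 3 and `specific` is 5: the alignment is the same in both cases (the `G` is skipped), but the position-specific scheme penalizes the gap less at a position assumed to tolerate it.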

Volume 24

Publication date: 2008